Propagating Trust and Distrust to Demote Web Spam
Authors
Abstract
Web spamming describes behavior that attempts to deceive search engines' ranking algorithms. TrustRank is a recent algorithm that can combat web spam by propagating trust among web pages. However, TrustRank propagates trust among web pages based on the number of outgoing links, which is also how PageRank propagates authority scores among web pages. This type of propagation may be suited to propagating authority, but it is not optimal for calculating trust scores for demoting spam sites. In this paper, we propose several alternative methods to propagate trust on the web. With experiments on a real web data set, we show that these methods can greatly decrease the number of web spam sites within the top portion of the trust ranking. In addition, we investigate the possibility of propagating distrust among web pages. Experiments show that combining trust and distrust values can demote more spam sites than the sole use of trust values.
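The outdegree-based propagation that the abstract attributes to TrustRank can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `propagate_trust`, the damping factor of 0.85, the fixed iteration count, and the toy graph are all assumptions made for illustration. Each page splits its trust equally among the pages it links to, and a fraction of the trust is reset to the trusted seed pages on every iteration.

```python
def propagate_trust(out_links, seeds, damping=0.85, iterations=50):
    """Illustrative TrustRank-style propagation (hypothetical helper).

    out_links maps each page to the list of pages it links to.
    seeds is the set of pages assumed to be trustworthy.
    """
    # Collect every page that appears as a source or a target.
    pages = set(out_links)
    for targets in out_links.values():
        pages.update(targets)

    # Seed distribution: uniform over the trusted seeds, zero elsewhere.
    seed_score = {p: (1.0 / len(seeds) if p in seeds else 0.0) for p in pages}
    trust = dict(seed_score)

    for _ in range(iterations):
        # Every iteration restarts part of the trust at the seeds.
        nxt = {p: (1.0 - damping) * seed_score[p] for p in pages}
        for page, targets in out_links.items():
            if not targets:
                continue  # pages without out-links pass nothing on in this sketch
            # Split the page's trust equally over its outgoing links.
            share = damping * trust[page] / len(targets)
            for target in targets:
                nxt[target] += share
        trust = nxt
    return trust


# Toy graph (assumed): one trusted seed and an isolated two-page spam cluster.
graph = {
    "good": ["a", "b"],
    "a": ["b"],
    "b": [],
    "spam": ["spam2"],
    "spam2": ["spam"],
}
trust = propagate_trust(graph, seeds={"good"})
```

In this toy graph the spam cluster receives no trust because no trusted page links into it. The equal split per outgoing link is the propagation rule the abstract argues is not optimal for trust scoring; the alternative splitting methods the paper proposes are not reproduced here.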
Similar Resources
Propagating Both Trust and Distrust with Target Differentiation for Combating Web Spam
Propagating trust/distrust from a set of seed (good/bad) pages to the entire Web has been widely used to combat Web spam. It has been mentioned that a combined use of good and bad seeds can lead to better results. However, little work has been known to realize this insight successfully. A serious issue of existing algorithms is that trust/distrust is propagated in non-differential ways. However...
Link-Based Similarity Search to Fight Web Spam
We investigate the usability of similarity search in fighting Web spam based on the assumption that an unknown spam page is more similar to certain known spam pages than to honest pages. In order to be successful, search engine spam never appears in isolation: we observe link farms and alliances for the sole purpose of search engine ranking manipulation. The artificial nature and strong inside ...
A Novel Approach to Propagating Distrust
Trust propagation is a fundamental topic of study in the theory and practice of ranking and recommendation systems on networks. The PageRank [9] algorithm ranks web pages by propagating trust throughout a network, and similar algorithms have been designed for recommendation systems. How might one analogously propagate distrust as well? This is a question of practical importance and...
A Survey on Web Spam Detection Methods: Taxonomy
Web spam refers to some techniques, which try to manipulate search engine ranking algorithms in order to raise web page position in search engine results. In the best case, spammers encourage viewers to visit their sites, and provide undeserved advertisement gains to the page owner. In the worst case, they use malicious contents in their pages and try to install malware on the victim’s machine....
Incorporating Trust into Web Search
The Web today includes many pages intended to deceive search engines, in which content or links are created to attain an unwarranted result ranking. Since the links among web pages are used to calculate authority, ranking systems should take into consideration which pages contain content to be trusted and which do not. In this paper, we assume the existence of a mechanism, such as, but not limi...